An Expressive Mandarin Speech Corpus

Authors

  • Jianhua Tao
  • Jian Yu
  • Yongguo Kang
Abstract

This paper introduces an expressive Mandarin speech corpus, supported by the National Hi-tech Program (863) and the National Science Funding of China (NSFC), for research into expressive speech information processing. The corpus contains emotional speech, dialogue speech, etc. To capture subtle acoustic information, the paper also presents annotation methods that use multiple perception results for emotional speech. Some acoustic analysis results are also discussed. The corpus has proved very useful in our research on both emotional speech processing and spoken language synthesis and understanding.

Similar resources

Modeling the Acoustic Correlates of Dialog Act for Expressive Chinese Tts Synthesis

This paper proposed a novel approach for describing the expressivity of dialog text and modelling their acoustic correlates for expressive text-to-speech (TTS) synthesis. We applied the Dialog Acts (DAs) in describing expressivity. In particular, we set up a Wizard-of-Oz (WoZ) data collection framework to collect the tourism domain corpus and annotated the DAs. A Pitch Target model which is opt...

The Blizzard Challenge 2008

The Blizzard Challenge 2008 was the fourth annual Blizzard Challenge. This year, participants were asked to build two voices from a UK English corpus and one voice from a Mandarin Chinese corpus. This is the first time that a language other than English has been included and also the first time that a large UK English corpus has been available. In addition, the English corpus contained somewhat...

AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus which is suitable for conducting the speech recognition research and building speech recognition systems for Mandarin. The recording procedure, including audio capturing devices and environments are presented in details. The preparation of the related resources, including transcriptions and lexic...

Design of Speech Corpus for Mandarin Text to Speech

This paper introduces the CASIA Mandarin corpus designed for Mandarin speech synthesis research. It has been carefully recorded by a professional female speaker under studio conditions. The corpus contains 5000 phonetic context balanced sentences with about 7 hours. The text transcription with word boundaries, POS tags and pronunciation are also involved. The final corpus has been delivered to ...

HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus

The paper describes the design, collection, transcription and analysis of 200 hours of HKUST Mandarin Telephone Speech Corpus (HKUST/MTS) from over 2100 Mandarin speakers in mainland China under the DARPA EARS framework. The corpus includes speech data, transcriptions and speaker demographic information. The speech data include 1206 ten-minute natural Mandarin conversations between either stran...

Publication date: 2006